It improves the accuracy of attribute selection, effectively overcomes the impact of noisy data, and strengthens the generalization ability of the decision tree. 既提高了属性选择的准确度又有效克服噪声数据的影响,使生成的决策树灵活泛化能力更强。
Research on Attribute Selection Algorithm Based on Analysis of Correlation between Attributes 基于属性间相关性分析的属性选择方法研究
Attribute Selection Method based on Minimum Description Length and Genetic Algorithm 基于最小描述长度和遗传算法的属性选择方法
Analysis and Research of Attribute Selection Methods in Data Mining 数据挖掘中属性选择算法的分析与研究
The Research on Attribute Selection of Node and Pruning of Decision-tree 决策树的结点属性选择和修剪方法研究
To address the scarcity of labeled data in large-scale short text categorization, a semi-supervised short text categorization method based on attribute selection is presented. 针对海量短文本分类中的标注语料匮乏问题,提出了一种基于属性选择的半监督短文本分类算法。
CART, which uses the Gini index for attribute selection and induces a binary tree; 依据GINI系数寻找最佳分割并生成二叉决策树的CART算法;
We adopt a way of attribute selection based on word entropy, use vectors which are represented by word frequency, and deduce its corresponding Bayesian formula. 我们采用了基于词熵的特征项提取方法,并且使用特征项单词出现频率来表示向量,推导出相应的贝叶斯计算公式。
An abstract channel model for learning from examples is presented, and a new attribute selection measure (channel capacity) is introduced. 本文提出了示例学习的抽象信道模型,引入一个新的特征选择量:信道容量。
This paper presents a new attribute selection metric, the gain-ratio criterion, to replace the gain criterion. 该文对于构造入侵规则决策树的过程,采用信息增益率为新的分类属性选择标准,并用它替代了原有的信息增益标准。
Fast Attribute Selection Algorithm Based on Fractal Dimension 一种基于分形维的快速属性选择算法
Attribute Selection Based on Regularization Networks-Genetic Algorithm and Its Application in Chemical Pattern Recognition 基于正则化网络-遗传算法的属性筛选及其在化学模式识别中的应用
This paper puts forward a decision tree algorithm based on rough sets that uses the concept of approximate accuracy for attribute selection; the resulting tree is simpler in structure and improves classification efficiency. 基于粗糙集理论,利用近似精度的概念来进行属性选择,构造决策树,有效地提高了效率并降低了决策树的复杂度。
This paper discusses three common entropy-based attribute selection criteria of the traditional decision tree algorithm, and presents a new decision tree building method based on attribute importance ranking. 讨论传统决策树算法中三种常用的基于熵的属性选择标准,提出一种基于属性重要性排序的建立决策树的新方法。
After restoring the network packets into connection records through protocol parsing, comprehensive experiments were carried out on how the partition of training and test sets, attribute selection, and time window size affect the anomaly detection model. 在将给定网络数据包文件解协为网络连接记录基础上,针对训练与测试数据集比例划分、分类属性选择和统计属性时间窗大小对网络异常检测模型的影响进行了全面的实验研究。
First, a method of ranking data attributes by importance using input-output correlation is presented in this paper. Then, attribute selection is carried out in order of attribute importance using an RBF neural network. 该方法先用输入输出关联法对数据属性进行重要性排序,然后按重要次序用RBF神经网络进行属性选择。
The choice of attribute selection metric for splitting has an important impact on the shape and depth of the resulting decision tree. 在根据入侵规则构造决策树时,所依据的分类属性选择标准对决策树的形状和深度有很大的影响。
A Hybrid Algorithm for Feature Attribute Selection 一个混合特征属性选择算法
Combining the attribute selection method with rough analysis, a classifier modeling method is put forward that uses rough analysis to remove redundant attributes and induce rules. 将属性选择方法与Rough分析相结合,利用Rough分析可以剔除属性集合中冗余属性并进行规则归纳的能力,提出一种基于Rough分析的分类器建模算法。
Attribute selection is the process of selecting the best subset of attributes (according to some criterion) from the original attribute set. 属性选择(Attribute selection)就是一个从原有的属性集合中选择一个(相对某种评价准则)最优属性子集的过程。
The attribute selection metric of the traditional heuristic algorithm was modified, and a new definition of attribute significance was proposed accordingly. 然后对传统启发式方法中选择属性的标准进行改进,由此给出了新的属性重要性定义;
To solve these problems, this thesis studies the key technologies in the security audit process: feature selection, discretization of continuous attributes, splitting-attribute selection criteria, and decision tree pruning. First, this thesis studies feature selection methods. 针对这些问题,本文对特征选择、连续属性离散化、分裂属性选择以及决策树剪枝这些安全审计过程中的关键技术进行了研究。
Then, an attribute selection method based on information gain and genetic algorithm is presented. Discussions on experimental results have shown its pros and cons. 进而,实现了一种基于信息增益和遗传算法结合的属性选择方法,并通过大量的实验分析,论述了这种方法存在的问题。
At the same time, when they are applied to select samples, both of them reduce the size of decision trees without sacrificing accuracy. A new test attribute selection criterion based on a modified coefficient is presented. 同时,用这两种方法对样本集进行筛选,都能在不损害分类准确率的同时减小决策树的规模。提出了一种基于修正系数的测试属性选择标准。
Based on the analysis and research of existing text preprocessing algorithms, we improved the Gini index method used for attribute selection in decision trees and applied the improved algorithm to select text features. 本文在分析研究现有文本预处理算法优、缺点的基础上,对基尼指数方法进行改进,并将其用于文本的特征选择,有效地提高了分类器的分类性能。
The algorithm makes improvements in two respects: sample selection and test attribute selection. It also optimizes the main steps in building a decision tree that are easily influenced by noise and prone to multi-value bias. 该算法从样本筛选和测试属性选择标准方面进行了改进,对决策树建立过程中易受噪声影响和易产生多值偏向问题的主要环节进行了优化。
The former algorithm makes improvements in three respects: discretization, dimensionality reduction, and attribute selection, which effectively resolves the conflict between efficiency and prediction precision. 前者从数据的离散化,降维,和属性选择方面有效的解决了处理大规模高维数据库时的效率与精度之间的矛盾。
Meanwhile, using the data mining research platform Weka, it analyzes the design and implementation of attribute selection algorithms and examines in detail how these algorithms run. 同时,结合数据挖掘研究平台Weka,分析了属性选择算法的设计与实现,深入剖析了属性选择算法的运行过程。
A method of attribute selection for software metrics based on information gain and an adaptive genetic algorithm (IG-AGA) is proposed. 提出了一种基于信息增益和自适应遗传算法的软件度量属性选择方法(IG-AGA)。
The second method, IVPRSDT, uses a new attribute selection criterion that comprehensively considers classification accuracy and the number of attribute values, namely weighted roughness and complexity. 第二种属性选择方法,使用了一种综合考虑分类精度和分支数量的属性选择新标准:加权粗糙度和复杂度。